Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lnd+walletunlocker: make unlock/init operations synchronous #4349

Closed

Conversation

Roasbeef
Copy link
Member

@Roasbeef Roasbeef commented Jun 3, 2020

In this commit, we fix a deadlock that can happen if a user attempts to
init then rapidly unlock a wallet right after. In my profiles, it seems
the lnd gets caught up on the bbolt flock, which deadlocks the entire
process. We fix this issue by making the Init/Unlock calls now fully
synchronous. Only a single outstanding request can exist across the
entire wallet unlocker service now.

Fixes #4340.

Fixes #3631.

@Roasbeef Roasbeef added rpc Related to the RPC interface wallet The wallet (lnwallet) which LND uses bug fix v0.11 labels Jun 3, 2020
@Roasbeef Roasbeef added this to the 0.11.0 milestone Jun 3, 2020
@Roasbeef Roasbeef requested review from wpaulino and carlaKC June 3, 2020 02:52
@Roasbeef
Copy link
Member Author

Roasbeef commented Jun 3, 2020

As is, this makes unlock/init fully synchronous. If we want to leave it as async, then a goroutine can be added in the unlocker service to release the mutex in the background after the operation is complete.

@@ -313,6 +342,9 @@ func (u *UnlockerService) InitWallet(ctx context.Context,
func (u *UnlockerService) UnlockWallet(ctx context.Context,
in *lnrpc.UnlockWalletRequest) (*lnrpc.UnlockWalletResponse, error) {

u.Lock()
defer u.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may also need another signal/bool at the UnlockerService level that indicates the wallet has been initialized/unlocked so that we can check it here and everywhere else, otherwise it seems like we still risk attempting to open the wallet twice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? Push things down even further? With where things are atm, we won't return back to the caller until the wallet has been fully initialized. However, for unlock we return a bit earlier once we have all the credentials. In my testing, the concurrent init was was ended up tripping things up.

For unlock, things only become "fully finalized" once we create the chain control.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, as soon as this method returns, unlocking isn't even possible since the listener service only has a lifetime of this call.

walletunlocker/service.go Show resolved Hide resolved
walletunlocker/service_test.go Show resolved Hide resolved
In this commit, we fix a deadlock that can happen if a user attempts to
init then rapidly unlock a wallet right after. In my profiles, it seems
the lnd gets caught up on the bbolt flock, which deadlocks the entire
process. We fix this issue by making the Init/Unlock calls now fully
synchronous. Only a single outstanding request can exist across the
entire wallet unlocker service now.

Fixes lightningnetwork#4330.

Fixes lightningnetwork#3631.
@@ -472,7 +504,20 @@ func TestChangeWalletPassword(t *testing.T) {
t.Fatalf("expected to receive password %x, got %x",
testPassword, unlockMsg.Passphrase)
}

// We'll now close the done channel to unlock the service to be
// able to accept another request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from your comment on the pr it seems the unlocker can only process one request, but several comments in the code refer to processing more requests. is it the case that the wallet unlocker could process more requests, but we only allow one due to the defer cancel() in waitForWalletPassword?

@@ -302,7 +323,19 @@ func (u *UnlockerService) InitWallet(ctx context.Context,
initMsg.ChanBackups = *chansToRestore
}

u.InitMsgs <- initMsg
select {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add helper to avoid having 3 copies of the same code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess there are really two versions, one for init and one for unlock...

u.UnlockMsgs <- walletUnlockMsg
select {
case u.UnlockMsgs <- walletUnlockMsg:
case <-u.quitChan:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, the main shutdown chan is passed to it.

@@ -1124,6 +1130,13 @@ func waitForWalletPassword(cfg *Config, restEndpoints []net.Addr,
// The wallet has already been created in the past, and is simply being
// unlocked. So we'll just return these passphrases.
case unlockMsg := <-pwService.UnlockMsgs:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was still able to reproduce the issue with the current changes. The issue doesn't seem to be related to the init call being asynchronous, but rather with the UnlockerService having a queued UnlockWallet call that it doesn't cancel after InitWallet has been called. If we had a UnlockerService.Quit method, we could call it here and check that it's been closed after the lock has been acquired at the UnlockerService level.

@Roasbeef Roasbeef modified the milestones: 0.11.0, 0.12.0 Jul 2, 2020
@Roasbeef Roasbeef removed the v0.11 label Jul 2, 2020
@Roasbeef Roasbeef modified the milestones: 0.12.0, 0.13.0 Nov 4, 2020
@Roasbeef Roasbeef added P2 should be fixed if one has time P1 MUST be fixed or reviewed labels Jan 28, 2021
@halseth halseth removed the P1 MUST be fixed or reviewed label Feb 25, 2021
@Roasbeef
Copy link
Member Author

Replaced by #4985

@Roasbeef Roasbeef closed this Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fix P2 should be fixed if one has time rpc Related to the RPC interface v0.13 wallet The wallet (lnwallet) which LND uses
Projects
None yet
5 participants